Detection of ambiguous portions of signal corresponding to OOV words or misrecognized portions of input
Authors
Abstract
One of the key problems for large-vocabulary ASR is the detection of unknown or misrecognized portions of the input. This paper presents results obtained using a local rejection algorithm. The algorithm is derived from the two-pass recognition algorithm by Murveit [3] and detects misrecognized portions from the number of words active per frame during the second pass. The underlying hypothesis is that recognition on unexpected data, i.e. noise or out-of-vocabulary (OOV) words, is likely to activate more words, since no word matches the data well; when the match is good, fewer words should be active. The algorithm was tried on part of the WSJ 5K November 1993 test (3370 words in total, none of them OOV) and on the digit-strings-only Macrophone data (14686 words, of which 895 were OOV). The results indicate that the approach is promising for detecting both OOV words and misrecognized portions of the input. It may provide a base on which to build tools for dealing with these phenomena. Such tools might include dialogue mechanisms based on the list of activated words corresponding to a rejected portion, display mechanisms such as reverse video, or rescoring schemes.
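The rejection idea described in the abstract can be illustrated with a minimal sketch. It assumes the per-frame active-word counts from the second pass are already available as a list of integers; the function name `reject_regions`, the fixed `threshold`, and the `min_len` minimum span length are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def reject_regions(
    active_counts: List[int], threshold: int, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return half-open frame spans [start, end) to reject.

    A frame is flagged when its active-word count exceeds `threshold`
    (an assumed, hand-picked value). Consecutive flagged frames are
    merged into one span, and spans shorter than `min_len` frames are
    dropped.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # index of the first frame in the current flagged run
    for i, count in enumerate(active_counts):
        if count > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The flagged run ended at frame i - 1.
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(active_counts) - start >= min_len:
        regions.append((start, len(active_counts)))
    return regions


# Made-up counts: frames 3-5 and frame 8 have many active words.
counts = [3, 4, 3, 12, 15, 14, 4, 2, 11, 3]
print(reject_regions(counts, threshold=10, min_len=2))  # [(3, 6)]
```

With `min_len=2` the single-frame spike at frame 8 is ignored, which is one simple way to avoid rejecting on isolated frames; how the original system chooses its threshold or smooths the counts is not stated in the abstract.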
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
A hierarchical language model incorporating class-dependent word models for OOV words recognition
A new language model is proposed to cope with the demands for recognizing out-of-vocabulary (OOV) words not registered in the lexicon. This language model is a class N-gram incorporating a set of word models that reflect the statistical characteristics of the phonotactics, which depend on the lexical classes. Utilization of class-dependency enhances recognition accuracy and enables identificati...
Robust spoken document retrieval methods for misrecognition and out-of-vocabulary keywords
This paper describes a Japanese spoken document retrieval system that is robust to out-of-vocabulary (OOV) words. A standard approach to spoken document retrieval is to automatically transcribe spoken documents into word sequences, which can be directly matched against queries. In this approach, documents that include OOV words or words misrecognized as other words cannot be retrieved. To av...
Modeling the phonological recognition of Persian words
A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...
Modelling Out-of-Vocabulary Words for Robust Speech Recognition
This thesis concerns the problem of unknown or out-of-vocabulary (OOV) words in continuous speech recognition. Most of today's state-of-the-art speech recognition systems can recognize only words that belong to some predefined finite word vocabulary. When encountering an OOV word, a speech recognizer erroneously substitutes the OOV word with a similarly sounding word from its vocabulary. Furthe...